{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Sequence-to-sequence LSTM outlier detector deployment"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wrap a keras seq2seq-LSTM python model for use as a prediction microservice in seldon-core and deploy on seldon-core running on Minikube or a Kubernetes cluster using GCP."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Dependencies"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- [helm](https://github.com/helm/helm)\n",
    "- [minikube](https://github.com/kubernetes/minikube)\n",
    "- [s2i](https://github.com/openshift/source-to-image) >= 1.1.13\n",
    "\n",
    "Python packages:\n",
    "- keras: pip install keras\n",
    "- tensorflow: https://www.tensorflow.org/install/pip"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Task"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The outlier detector needs to spot anomalies in electrocardiograms (ECG's). The dataset \"ECG5000\" contains 5000 ECG's, originally obtained from [Physionet](https://physionet.org/cgi-bin/atm/ATM) under the name \"BIDMC Congestive Heart Failure Database(chfdb)\", record \"chf07\". The data has been pre-processed in 2 steps: first each heartbeat is extracted, and then each beat is made equal length via interpolation. The data is labeled and contains 5 classes. The first class which contains almost 60% of the observations is seen as \"normal\" while the others are outliers. The seq2seq-LSTM algorithm is trained on some heartbeats from the first class and needs to flag the other classes as anomalies. The plot below shows an example ECG for each of the classes."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![ECGs](images/ecg.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train locally\n",
    "\n",
    "Train on some inlier ECG's. The data can be downloaded [here](http://www.timeseriesclassification.com/description.php?Dataset=ECG5000) and should be extracted in the [data](./data) folder."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python train.py \\\n",
    "--dataset './data/ECG5000_TEST.arff' \\\n",
    "--data_range 0 2627 \\\n",
    "--minmax \\\n",
    "--timesteps 140 \\\n",
    "--encoder_dim 20 \\\n",
    "--decoder_dim 40 \\\n",
    "--output_activation 'sigmoid' \\\n",
    "--dropout 0 \\\n",
    "--learning_rate 0.005 \\\n",
    "--loss 'mean_squared_error' \\\n",
    "--epochs 100 \\\n",
    "--batch_size 32 \\\n",
    "--validation_split 0.2 \\\n",
    "--print_progress \\\n",
    "--save"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The plot below shows a typical prediction (*red line*) of an inlier (class 1) ECG compared to the original (*blue line*) after training the seq2seq-LSTM model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![inlier_ecg](images/inlier_ecg.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On the other hand, the model is not good at fitting ECG's from the other classes, as illustrated in the chart below:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![outlier_ecg](images/outlier_ecg.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The predictions in the above charts are made on ECG's the model has not seen before. The differences in scale are due to the sigmoid output layer and do not affect the prediction accuracy."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Test using Kubernetes cluster on GCP or Minikube"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run the outlier detector as a model or a transformer. If you want to run the anomaly detector as a transformer, change the SERVICE_TYPE variable from MODEL to TRANSFORMER [here](./.s2i/environment), set MODEL = False and change ```OutlierSeq2SeqLSTM.py``` to:\n",
    "\n",
    "```python\n",
    "from CoreSeq2SeqLSTM import CoreSeq2SeqLSTM\n",
    "\n",
    "class OutlierSeq2SeqLSTM(CoreSeq2SeqLSTM):\n",
    "    \"\"\" Outlier detection using a sequence-to-sequence (seq2seq) LSTM model.\n",
    "    \n",
    "    Parameters\n",
    "    ----------\n",
    "        threshold (float) :  reconstruction error (mse) threshold used to classify outliers\n",
    "        reservoir_size (int) : number of observations kept in memory using reservoir sampling\n",
    "    \"\"\"\n",
    "    def __init__(self,threshold=0.003,reservoir_size=50000,model_name='seq2seq',load_path='./models/'):\n",
    "        \n",
    "        super().__init__(threshold=threshold,reservoir_size=reservoir_size,\n",
    "                         model_name=model_name,load_path=load_path)\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "MODEL = True"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pick Kubernetes cluster on GCP or Minikube."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "MINIKUBE = True"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if MINIKUBE:\n",
    "    !minikube start --memory 4096 \n",
    "else:\n",
    "    !gcloud container clusters get-credentials standard-cluster-1 --zone europe-west1-b --project seldon-demos"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a cluster-wide cluster-admin role assigned to a service account named “default” in the namespace “kube-system”."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl create clusterrolebinding kube-system-cluster-admin --clusterrole=cluster-admin \\\n",
    "--serviceaccount=kube-system:default"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl create namespace seldon"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add current context details to the configuration file in the seldon namespace."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl config set-context $(kubectl config current-context) --namespace=seldon"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create tiller service account and give it a cluster-wide cluster-admin role."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl -n kube-system create sa tiller\n",
    "!kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller\n",
    "!helm init --service-account tiller"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check deployment rollout status and deploy seldon/spartakus helm charts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl rollout status deploy/tiller-deploy -n kube-system"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!helm install ../../../helm-charts/seldon-core-operator --name seldon-core --set usage_metrics.enabled=true   --namespace seldon-system"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check deployment rollout status for seldon core."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl rollout status deploy/seldon-controller-manager -n seldon-system"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Install Ambassador API gateway"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!helm install stable/ambassador --name ambassador --set crds.keep=false"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!kubectl rollout status deployment.apps/ambassador"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If Minikube used: create docker image for outlier detector inside Minikube using s2i. Besides the transformer image and the demo specific model image, the general model image for the Seq2Seq LSTM outlier detector is also available from Docker Hub as ***seldonio/outlier-s2s-lstm-model:0.1***."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if MINIKUBE & MODEL:\n",
    "    !eval $(minikube docker-env) && \\\n",
    "    s2i build . seldonio/seldon-core-s2i-python3:0.4 seldonio/outlier-s2s-lstm-model-demo:0.1\n",
    "elif MINIKUBE:\n",
    "    !eval $(minikube docker-env) && \\\n",
    "    s2i build . seldonio/seldon-core-s2i-python3:0.4 seldonio/outlier-s2s-lstm-transformer:0.1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Install outlier detector helm charts and set *threshold* and *reservoir_size* hyperparameter values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if MODEL:\n",
    "    !helm install ../../../helm-charts/seldon-od-model \\\n",
    "        --name outlier-detector \\\n",
    "        --namespace=seldon \\\n",
    "        --set model.type=seq2seq \\\n",
    "        --set model.seq2seq.image.name=seldonio/outlier-s2s-lstm-model-demo:0.1 \\\n",
    "        --set model.seq2seq.threshold=0.002 \\\n",
    "        --set model.seq2seq.reservoir_size=50000 \\\n",
    "        --set oauth.key=oauth-key \\\n",
    "        --set oauth.secret=oauth-secret \\\n",
    "        --set replicas=1\n",
    "else:\n",
    "    !helm install ../../../helm-charts/seldon-od-transformer \\\n",
    "        --name outlier-detector \\\n",
    "        --namespace=seldon \\\n",
    "        --set outlierDetection.enabled=true \\\n",
    "        --set outlierDetection.name=outlier-s2s-lstm \\\n",
    "        --set outlierDetection.type=seq2seq \\\n",
    "        --set outlierDetection.seq2seq.image.name=seldonio/outlier-s2s-lstm-transformer:0.1 \\\n",
    "        --set outlierDetection.seq2seq.threshold=0.002 \\\n",
    "        --set outlierDetection.seq2seq.reservoir_size=50000 \\\n",
    "        --set oauth.key=oauth-key \\\n",
    "        --set oauth.secret=oauth-secret \\\n",
    "        --set model.image.name=seldonio/outlier-s2s-lstm-model:0.1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Port forward Ambassador\n",
    "\n",
    "Run command in terminal:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```\n",
    "kubectl port-forward $(kubectl get pods -n seldon -l app.kubernetes.io/name=ambassador -o jsonpath='{.items[0].metadata.name}') -n seldon 8003:8080\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import rest requests, load data and test requests"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from utils import get_payload, rest_request_ambassador, send_feedback_rest, ecg_data\n",
    "\n",
    "ecg_data, ecg_labels = ecg_data(dataset='TRAIN')\n",
    "X = ecg_data[0,:].reshape(1,ecg_data.shape[1],1)\n",
    "label = ecg_labels[0].reshape(1)\n",
    "print(X.shape)\n",
    "print(label.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Test the rest requests with the generated data. It is important that the order of requests is respected. First we make predictions, then we get the \"true\" labels back using the feedback request. If we do not respect the order and eg keep making predictions without getting the feedback for each prediction, there will be a mismatch between the predicted and \"true\" labels. This will result in errors in the produced metrics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "request = get_payload(X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "response = rest_request_ambassador(\"outlier-detector\",\"seldon\",request,endpoint=\"localhost:8003\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If the outlier detector is used as a transformer, the output of the anomaly detection is added as part of the metadata. If it is used as a model, we send model feedback to retrieve custom performance metrics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if MODEL:\n",
    "    send_feedback_rest(\"outlier-detector\",\"seldon\",request,response,0,label,endpoint=\"localhost:8003\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Analytics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Install the helm charts for prometheus and the grafana dashboard"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!helm install ../../../helm-charts/seldon-core-analytics --name seldon-core-analytics \\\n",
    "    --set grafana_prom_admin_password=password \\\n",
    "    --set persistence.enabled=false \\\n",
    "    --namespace seldon"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Port forward Grafana dashboard"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run command in terminal:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```\n",
    "kubectl port-forward $(kubectl get pods -n seldon -l app=grafana-prom-server -o jsonpath='{.items[0].metadata.name}') -n seldon 3000:3000\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can then view an analytics dashboard inside the cluster at http://localhost:3000/dashboard/db/prediction-analytics?refresh=5s&orgId=1. Your IP address may be different. get it via minikube ip. Login with:\n",
    "\n",
    "Username : admin\n",
    "\n",
    "password : password (as set when starting seldon-core-analytics above)\n",
    "\n",
    "Import the outlier-detector-s2s-lstm dashboard from ../../../helm-charts/seldon-core-analytics/files/grafana/configs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Run simulation\n",
    "\n",
    "- Sample random ECG from dataset.\n",
    "- Get payload for the observation.\n",
    "- Make a prediction.\n",
    "- Send the \"true\" label with the feedback if the detector is run as a model.\n",
    "\n",
    "It is important that the prediction-feedback order is maintained. Otherwise there will be a mismatch between the predicted and \"true\" labels.\n",
    "\n",
    "View the progress on the grafana \"Outlier Detection\" dashboard. Most metrics need the outlier detector to be run as a model since they need model feedback."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "import time\n",
    "n_requests = 100\n",
    "n_samples, sample_length = ecg_data.shape\n",
    "for i in range(n_requests):\n",
    "    idx = np.random.choice(n_samples)\n",
    "    X = ecg_data[idx,:].reshape(1,sample_length,1)\n",
    "    label = ecg_labels[idx].reshape(1)\n",
    "    request = get_payload(X)\n",
    "    response = rest_request_ambassador(\"outlier-detector\",\"seldon\",request,endpoint=\"localhost:8003\")\n",
    "    if MODEL:\n",
    "        send_feedback_rest(\"outlier-detector\",\"seldon\",request,response,0,label,endpoint=\"localhost:8003\")\n",
    "    time.sleep(1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if MINIKUBE:\n",
    "    !minikube delete"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
